{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Neo4j Property Graph Index\n",
    "\n",
    "<a href=\"https://colab.research.google.com/github/run-llama/llama_index/blob/main/docs/examples/property_graph/property_graph_neo4j.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>\n",
    "\n",
    "\n",
    "Neo4j is a production-grade graph database, capable of storing a property graph, performing vector search, filtering, and more.\n",
    "\n",
    "The easiest way to get started is with a cloud-hosted instance using [Neo4j Aura](https://neo4j.com/cloud/platform/aura-graph-database/)\n",
    "\n",
    "For this notebook, we will instead cover how to run the database locally with docker.\n",
    "\n",
    "If you already have an existing graph, please skip to the end of this notebook."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%pip install llama-index llama-index-graph-stores-neo4j"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Docker Setup\n",
    "\n",
    "To launch Neo4j locally, first ensure you have docker installed. Then, you can launch the database with the following docker command\n",
    "\n",
    "```bash\n",
    "docker run \\\n",
    "    -p 7474:7474 -p 7687:7687 \\\n",
    "    -v $PWD/data:/data -v $PWD/plugins:/plugins \\\n",
    "    --name neo4j-apoc \\\n",
    "    -e NEO4J_apoc_export_file_enabled=true \\\n",
    "    -e NEO4J_apoc_import_file_enabled=true \\\n",
    "    -e NEO4J_apoc_import_file_use__neo4j__config=true \\\n",
    "    -e NEO4JLABS_PLUGINS=\\[\\\"apoc\\\"\\] \\\n",
    "    neo4j:latest\n",
    "```\n",
    "\n",
    "From here, you can open the db at [http://localhost:7474/](http://localhost:7474/). On this page, you will be asked to sign in. Use the default username/password of `neo4j` and `neo4j`.\n",
    "\n",
    "Once you login for the first time, you will be asked to change the password.\n",
    "\n",
    "After this, you are ready to create your first property graph!"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Env Setup\n",
    "\n",
    "We need just a few environment setups to get started."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "\n",
    "os.environ[\"OPENAI_API_KEY\"] = \"sk-proj-...\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!mkdir -p 'data/paul_graham/'\n",
    "!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/examples/data/paul_graham/paul_graham_essay.txt' -O 'data/paul_graham/paul_graham_essay.txt'"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import nest_asyncio\n",
    "\n",
    "nest_asyncio.apply()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "/Users/loganmarkewich/Library/Caches/pypoetry/virtualenvs/llama-index-caVs7DDe-py3.11/lib/python3.11/site-packages/tqdm/auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html\n",
      "  from .autonotebook import tqdm as notebook_tqdm\n"
     ]
    }
   ],
   "source": [
    "from llama_index.core import SimpleDirectoryReader\n",
    "\n",
    "documents = SimpleDirectoryReader(\"./data/paul_graham/\").load_data()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Index Construction"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index.graph_stores.neo4j import Neo4jPropertyGraphStore\n",
    "\n",
    "# Note: used to be `Neo4jPGStore`\n",
    "graph_store = Neo4jPropertyGraphStore(\n",
    "    username=\"neo4j\",\n",
    "    password=\"llamaindex\",\n",
    "    url=\"bolt://localhost:7687\",\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Parsing nodes: 100%|██████████| 1/1 [00:00<00:00, 21.63it/s]\n",
      "Extracting paths from text with schema: 100%|██████████| 22/22 [01:06<00:00,  3.02s/it]\n",
      "Generating embeddings: 100%|██████████| 1/1 [00:00<00:00,  1.06it/s]\n",
      "Generating embeddings: 100%|██████████| 1/1 [00:00<00:00,  1.89it/s]\n"
     ]
    }
   ],
   "source": [
    "from llama_index.core import PropertyGraphIndex\n",
    "from llama_index.embeddings.openai import OpenAIEmbedding\n",
    "from llama_index.llms.openai import OpenAI\n",
    "from llama_index.core.indices.property_graph import SchemaLLMPathExtractor\n",
    "\n",
    "index = PropertyGraphIndex.from_documents(\n",
    "    documents,\n",
    "    embed_model=OpenAIEmbedding(model_name=\"text-embedding-3-small\"),\n",
    "    kg_extractors=[\n",
    "        SchemaLLMPathExtractor(\n",
    "            llm=OpenAI(model=\"gpt-3.5-turbo\", temperature=0.0)\n",
    "        )\n",
    "    ],\n",
    "    property_graph_store=graph_store,\n",
    "    show_progress=True,\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now that the graph is created, we can explore it in the UI by visting [http://localhost:7474/](http://localhost:7474/). \n",
    "\n",
    "The easiest way to see the entire graph is to use a cypher command like `\"match n=() return n\"` at the top.\n",
    "\n",
    "To delete an entire graph, a useful command is `\"match n=() detach delete n\"`."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Querying and Retrieval"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Interleaf -> Got crushed by -> Moore's law\n",
      "Interleaf -> Made -> Scripting language\n",
      "Interleaf -> Had -> Smart people\n",
      "Interleaf -> Inspired by -> Emacs\n",
      "Interleaf -> Had -> Few years to live\n",
      "Interleaf -> Made -> Software\n",
      "Interleaf -> Had done -> Something bold\n",
      "Interleaf -> Added -> Scripting language\n",
      "Interleaf -> Built -> Impressive technology\n",
      "Interleaf -> Was -> Company\n",
      "Viaweb -> Was -> Profitable\n",
      "Viaweb -> Was -> Growing rapidly\n",
      "Viaweb -> Suggested -> Hospital\n",
      "Idea -> Was clear from -> Experience\n",
      "Idea -> Would have to be embodied as -> Company\n",
      "Painting department -> Seemed to be -> Rigorous\n"
     ]
    }
   ],
   "source": [
    "retriever = index.as_retriever(\n",
    "    include_text=False,  # include source text in returned nodes, default True\n",
    ")\n",
    "\n",
    "nodes = retriever.retrieve(\"What happened at Interleaf and Viaweb?\")\n",
    "\n",
    "for node in nodes:\n",
    "    print(node.text)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Interleaf had smart people and built impressive technology but got crushed by Moore's Law. Viaweb was profitable and growing rapidly.\n"
     ]
    }
   ],
   "source": [
    "query_engine = index.as_query_engine(include_text=True)\n",
    "\n",
    "response = query_engine.query(\"What happened at Interleaf and Viaweb?\")\n",
    "\n",
    "print(str(response))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Loading from an existing Graph\n",
    "\n",
    "If you have an existing graph (either created with LlamaIndex or otherwise), we can connect to and use it!\n",
    "\n",
    "**NOTE:** If your graph was created outside of LlamaIndex, the most useful retrievers will be [text to cypher](/../../module_guides/indexing/lpg_index_guide#texttocypherretriever) or [cypher templates](/../../module_guides/indexing/lpg_index_guide#cyphertemplateretriever). Other retrievers rely on properties that LlamaIndex inserts."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index.graph_stores.neo4j import Neo4jPropertyGraphStore\n",
    "from llama_index.core import PropertyGraphIndex\n",
    "from llama_index.embeddings.openai import OpenAIEmbedding\n",
    "from llama_index.llms.openai import OpenAI\n",
    "\n",
    "graph_store = Neo4jPropertyGraphStore(\n",
    "    username=\"neo4j\",\n",
    "    password=\"794613852\",\n",
    "    url=\"bolt://localhost:7687\",\n",
    ")\n",
    "\n",
    "index = PropertyGraphIndex.from_existing(\n",
    "    property_graph_store=graph_store,\n",
    "    llm=OpenAI(model=\"gpt-3.5-turbo\", temperature=0.3),\n",
    "    embed_model=OpenAIEmbedding(model_name=\"text-embedding-3-small\"),\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "From here, we can still insert more documents!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index.core import Document\n",
    "\n",
    "document = Document(text=\"LlamaIndex is great!\")\n",
    "\n",
    "index.insert(document)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Llamaindex -> Is -> Great\n"
     ]
    }
   ],
   "source": [
    "nodes = index.as_retriever(include_text=False).retrieve(\"LlamaIndex\")\n",
    "\n",
    "print(nodes[0].text)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For full details on construction, retrieval, querying of a property graph, see the [full docs page](/../../module_guides/indexing/lpg_index_guide)."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "llama-index-bXUwlEfH-py3.11",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
